28 research outputs found

    Immersive Insights: A Hybrid Analytics System for Collaborative Exploratory Data Analysis

    In the past few years, augmented reality (AR) and virtual reality (VR) technologies have experienced terrific improvements in both accessibility and hardware capabilities, encouraging the application of these devices across various domains. While researchers have demonstrated the possible advantages of AR and VR for certain data science tasks, it is still unclear how these technologies would perform in the context of exploratory data analysis (EDA) at large. In particular, we believe it is important to better understand which level of immersion EDA would concretely benefit from, and to quantify the contribution of AR and VR with respect to standard analysis workflows. In this work, we leverage a Dataspace reconfigurable hybrid reality environment to study how data scientists might perform EDA in a co-located, collaborative context. Specifically, we propose the design and implementation of Immersive Insights, a hybrid analytics system combining high-resolution displays, table projections, and augmented reality (AR) visualizations of the data. We conducted a two-part user study with twelve data scientists, in which we evaluated how different levels of data immersion affect the EDA process and compared the performance of Immersive Insights with a state-of-the-art, non-immersive data analysis system. Comment: VRST 201

    Molecular and Evolutionary Bases of Within-Patient Genotypic and Phenotypic Diversity in Escherichia coli Extraintestinal Infections

    Although polymicrobial infections, caused by combinations of viruses, bacteria, fungi and parasites, are being recognised with increasing frequency, little is known about the occurrence of within-species diversity in bacterial infections and the molecular and evolutionary bases of this diversity. We used multiple approaches to study the genomic and phenotypic diversity among 226 Escherichia coli isolates from deep and closed visceral infections occurring in 19 patients. We observed genomic variability among isolates from the same site within 11 patients. This diversity was of two types, as patients were infected either by several distinct E. coli clones (4 patients) or by members of a single clone that exhibited micro-heterogeneity (11 patients); both types of diversity were present in 4 patients. A surprisingly wide continuum of antibiotic resistance, outer membrane permeability, growth rate, stress resistance, red dry and rough morphotype characteristics and virulence properties was present within the isolates of single clones in 8 of the 11 patients showing genomic micro-heterogeneity. Many of the observed phenotypic differences within clones affected the trade-off between self-preservation and nutritional competence (SPANC). We showed in 3 patients that this phenotypic variability was associated with distinct levels of RpoS in co-existing isolates. Genome mutational analysis and global proteomic comparisons in isolates from a patient revealed a star-like relationship of changes amongst clonally diverging isolates. A mathematical model demonstrated that multiple genotypes with distinct RpoS levels can co-exist as a result of the SPANC trade-off. In the cases involving infection by a single clone, we present several lines of evidence to suggest diversification during the infectious process rather than an infection by multiple isolates exhibiting micro-heterogeneity. Our results suggest that bacteria are subject to trade-offs during an infectious process and that the observed diversity resembles results obtained in experimental evolution studies. Whatever the mechanisms leading to diversity, our results have strong medical implications in terms of the need for more extensive isolate testing before deciding on antibiotic therapies.

    Style du génome exploré par analyse textuelle de l'ADN

    DNA sequences can be regarded as texts written in a four-letter alphabet. Techniques inspired by textual data analysis characterize these sequences by the frequencies of short oligonucleotides (or words). The set of frequencies of all words of a given length is called the “genomic signature” (the term “signature” is justified because this set is species-specific). Since the genomic signature can be observed in DNA segments as short as 1 kb, it appears to result from a “writing style” that characterizes the organization of DNA throughout each genome. Moreover, proximity between species in terms of genomic signature often corresponds to proximity in terms of taxonomy. However, genomic signature analysis quickly runs into limitations due to the curse of dimensionality. Indeed, high-dimensional data (the genomic signature generally has 256 dimensions) show counterintuitive properties; for example, the concentration of Euclidean distances is well known. From these observations, we set up procedures for evaluating metrics between signatures in order to bring out the biological information that can be extracted from genomic signatures. An associated non-linear method for projecting neighbourhoods circumvents the curse of dimensionality and makes it possible to visualize the space occupied by the data. Analysing the relations between signatures raises the question of how much each variable (each word) contributes to the distance between signatures. An original Z-score based on the variation of word frequencies along genomes makes it possible to quantify these contributions. Comparing “local signatures” along a genome allows atypical regions to be extracted, and a method based on signal analysis segments these regions precisely. With this set of methods, we obtain diverse biological results. In particular, we highlight an organization of the genomic signature space that is coherent with species taxonomy. Moreover, we note the presence of a “DNA syntax”: there are “syntactic words” and “semantic words”, and the signature relies mainly on syntactic words. Lastly, the analysis of signatures along the genome allows the detection and precise segmentation of RNA and of probable horizontal transfers. A convergence of the style of horizontal transfers towards the host signature can also be observed. A variety of results were thus obtained by signature analysis; its ease of use and speed make genomic signature analysis a powerful tool for extracting biological information from genomes.
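    As an illustration of the genomic signature described in this abstract, the sketch below counts word frequencies in a DNA string. It is not the thesis's implementation: the function names `genomic_signature` and `euclidean_distance`, the use of overlapping windows, and the choice to skip windows containing letters other than A, C, G, T are all assumptions made for the example. With words of length 4 over a four-letter alphabet there are 4^4 = 256 possible words, which is consistent with the 256 dimensions mentioned above.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def genomic_signature(sequence, k=4):
    """Relative frequency of every k-letter word over A, C, G, T.

    Assumptions (not stated in the abstract): windows overlap, and any
    window containing a letter outside A/C/G/T is ignored.
    """
    seq = sequence.upper()
    valid = set(ALPHABET)
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= valid:
            counts[word] += 1
    total = sum(counts.values())
    # Enumerate all 4**k words so every signature has the same dimensions.
    words = ("".join(p) for p in product(ALPHABET, repeat=k))
    return {w: (counts[w] / total if total else 0.0) for w in words}


def euclidean_distance(sig_a, sig_b):
    """Plain Euclidean distance between two signatures of equal word length."""
    return sum((sig_a[w] - sig_b[w]) ** 2 for w in sig_a) ** 0.5
```

    The Euclidean distance is shown only because the abstract names it as an example of a metric whose distances concentrate in high dimension; the metrics the thesis actually evaluates are not specified here.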

    Style du génome exploré par analyse textuelle de l'ADN

    PARIS-BIUSJ-Thèses (751052125) / Sudoc; PARIS-BIUSJ-Physique recherche (751052113) / Sudoc; Sudoc; France; F

    Daily forecast of solar thermal energy production for heat storage management

    Solar energy offers a renewable source of power, but its fluctuating nature raises concerns about balancing the electrical grid. Network regulators have to estimate the upcoming production to match supply with demand; consequently, power plant operators may be asked to provide accurate forecasts. Planning the thermal or electrical output of solar power plants is thus essential to ensure a stable power supply chain. This paper presents a solution that couples a meteorological model with a solar power plant performance model. The power output is predicted 24 h ahead in the case of a solar Fresnel power plant. The required direct normal irradiance (DNI) is inferred from the global horizontal irradiance (GHI); the thermal production is evaluated from an optical and thermal model. Our approach has been validated on a 1000 m² Fresnel power plant, paving the way for a model-based storage strategy.
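    The abstract does not say how the direct normal irradiance is inferred from the global horizontal irradiance. As a hedged illustration only, the sketch below uses the standard closure relation GHI = DNI · cos(θz) + DHI, where θz is the solar zenith angle and DHI is the diffuse horizontal irradiance. The function name `dni_from_ghi`, the assumption that a DHI estimate is already available (for instance from a separate decomposition model), and the cut-off near the horizon are choices made for this example, not details from the paper.

```python
import math


def dni_from_ghi(ghi, dhi, zenith_deg, min_cos_zenith=1e-3):
    """Direct normal irradiance (W/m^2) from the closure relation
    GHI = DNI * cos(zenith) + DHI.

    Assumptions for this sketch: DHI is supplied by the caller, and the
    result is set to zero when the sun is at or very near the horizon
    (cos(zenith) below min_cos_zenith) to avoid dividing by ~0.
    """
    cos_z = math.cos(math.radians(zenith_deg))
    if cos_z < min_cos_zenith:
        return 0.0
    # Clip negative values that can arise from inconsistent inputs.
    return max(0.0, (ghi - dhi) / cos_z)
```

    For example, `dni_from_ghi(800.0, 100.0, 60.0)` divides the 700 W/m² beam component on the horizontal plane by cos(60°).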